Disagreements in meta-analyses using outcomes measured on continuous or rating scales: observer agreement study

Authors

  • Britta Tendal
  • Julian P T Higgins
  • Peter Jüni
  • Asbjørn Hróbjartsson
  • Sven Trelle
  • Eveline Nüesch
  • Simon Wandel
  • Anders W Jørgensen
  • Katarina Gesser
  • Søren Ilsøe-Kristensen
  • Peter C Gøtzsche
Abstract

OBJECTIVE To study the inter-observer variation related to extraction of continuous and numerical rating scale data from trial reports for use in meta-analyses.

DESIGN Observer agreement study.

DATA SOURCES A random sample of 10 Cochrane reviews that presented a result as a standardised mean difference (SMD) was retrieved, together with the protocols for the reviews and the trial reports (n=45).

DATA EXTRACTION Five experienced methodologists and five PhD students independently extracted data from the trial reports for calculation of the first SMD result in each review. The observers had access to the protocols, where the relevant outcome was highlighted, but not to the reviews. Agreement was analysed at both trial and meta-analysis level, pairing the observers in all possible ways (45 pairs, yielding 2025 pairs of trials and 450 pairs of meta-analyses). Agreement was defined as SMDs that differed by less than 0.1 in their point estimates or confidence intervals.

RESULTS Agreement was 53% at trial level and 31% at meta-analysis level. Including all pairs, the median disagreement was SMD=0.22 (interquartile range 0.07-0.61). The experts agreed somewhat more often than the PhD students at trial level (61% v 46%), but not at meta-analysis level. Important reasons for disagreement were differences in selection of time points, scales, control groups, and type of calculations; whether to include a trial in the meta-analysis; and data extraction errors made by the observers. In 14 of the 100 SMDs calculated at the meta-analysis level, individual observers reached conclusions that differed from those of the originally published review.

CONCLUSIONS Disagreements were common and often larger than the effect of commonly used treatments. Meta-analyses using SMDs are prone to observer variation and should be interpreted with caution. The reliability of meta-analyses might be improved by having more detailed review protocols, more than one observer, and statistical expertise.
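
The pairing counts quoted in the abstract follow from simple combinatorics: 10 observers give 45 unordered pairs, and each pair is compared on 45 trials and 10 meta-analyses. The minimal Python sketch below reproduces those counts and illustrates one possible reading of the agreement rule. The function name `agree`, the tuple layout `(smd, ci_low, ci_high)`, and the choice to require the point estimate and both confidence limits to differ by less than 0.1 are assumptions made for illustration; they are not taken from the authors' analysis.

```python
from itertools import combinations

N_OBSERVERS = 10  # five methodologists plus five PhD students
N_TRIALS = 45     # trial reports
N_REVIEWS = 10    # Cochrane reviews, one meta-analysis each

# All unordered pairs of observers.
observer_pairs = list(combinations(range(N_OBSERVERS), 2))
n_pairs = len(observer_pairs)

# Each observer pair is compared once per trial and once per meta-analysis.
trial_comparisons = n_pairs * N_TRIALS
meta_comparisons = n_pairs * N_REVIEWS


def agree(a, b, tol=0.1):
    """Return True if two SMD results agree.

    a, b: (smd, ci_low, ci_high) tuples (hypothetical layout).
    Assumed reading of the rule: the point estimate and both
    confidence limits must each differ by less than tol.
    """
    return all(abs(x - y) < tol for x, y in zip(a, b))


print(n_pairs, trial_comparisons, meta_comparisons)
print(agree((0.30, 0.10, 0.50), (0.35, 0.12, 0.55)))
print(agree((0.30, 0.10, 0.50), (0.45, 0.25, 0.65)))
```

Under this reading, two extractions with point estimates of 0.30 and 0.45 count as a disagreement even though both suggest an effect in the same direction, which shows how strict a 0.1 threshold is.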

Similar resources

Lack of interchangeability between visual analogue and verbal rating pain scales: a cross sectional description of pain etiology groups

BACKGROUND Rating scales like the visual analogue scale, VAS, and the verbal rating scale, VRS, are often used for pain assessments both in clinical work and in research, despite the lack of a gold standard. Interchangeability of recorded pain intensity captured in the two scales has been discussed earlier, but not in conjunction with taking the influence of pain etiology into consideration. ...

Developing an Analytic Scale for Scoring EFL Descriptive Writing

English language practitioners have long relied on intuition-based scales for rating EFL/ESL writing. As these scales lack an empirical basis, the scores they generate tend to be unreliable, which results in invalid interpretations. Given the significance of the genre of description and the fact that the relevant literature does not introduce any data-based analytic scales for rating EFL descri...

SEQUENTIAL PENALTY HANDLING TECHNIQUES FOR SIZING DESIGN OF PIN-JOINTED STRUCTURES BY OBSERVER-TEACHER-LEARNER-BASED OPTIMIZATION

Despite an extensive literature on developing fitness-based optimization algorithms, their performance is still challenged by constraint handling in various engineering tasks. The present study concerns the widely used external penalty technique for sizing design of pin-jointed structures. Observer-teacher-learner-based optimization is employed here since previously addressed by a number ...

Developing Rating Scale Descriptors for Assessing the Stages of Writing Process: The Constructs Underlying Students' Writing Performances

The purpose of the present study is to develop appropriate scoring scales for each of the defined stages of the writing process, and also to determine to what extent these scoring scales can reliably and validly assess the performances of EFL learners in an academic writing task. Two hundred and two students’ writing samples were collected after a step-by-step process oriented essay writing ins...

Improving interrater agreement about brain microbleeds: development of the Brain Observer MicroBleed Scale (BOMBS).

BACKGROUND AND PURPOSE If the diagnostic and prognostic significance of brain microbleeds (BMBs) are to be investigated and used for these purposes in clinical practice, observer variation in BMB assessment must be minimized. METHODS Two doctors used a pilot rating scale to describe the number and distribution of BMBs (round, low-signal lesions, <10 mm diameter on gradient echo MRI) among 264...

Journal title:

Volume 339  Issue 

Pages  -

Publication date 2009